
Exclude system tables and only restore user's schema#486

Open
kendrick-ren wants to merge 1 commit into scylladb:master from kendrick-ren:master

Conversation

@kendrick-ren

@kendrick-ren kendrick-ren commented Aug 21, 2025

This PR excludes system tables from the restore process and only restores the user's schema with the scylla manager restore --restore-schema command.

  1. Excluded the system tables from the variable "tables_to_restore", so that the playbook only refreshes the user's tables with the items listed in this variable.
  2. Added the step "Restore schema from a backup snapshot" to restore the user's schema with the scylla manager restore --restore-schema command.
  3. Changed the prerequisites in README.md.
  4. Changed vars.yaml.example by adding scylla_manager_cluster_name, and changed the hosts.example inventory file by adding scyllamgr_host and splitting it into 2 sections.

@kendrick-ren kendrick-ren marked this pull request as draft August 21, 2025 18:26
@kendrick-ren kendrick-ren marked this pull request as ready for review August 21, 2025 18:29
@vladzcloudius
Collaborator

  1. Exclude system tables and only restore user's schema

    1. Take out the step of cleaning up the directories (data/commit log/hint/...)

    2. Use scylla manager command to restore the schema

    3. Add related variables

Other new features (like changing the tombstone_gc setting, adjusting the playbook structure to ensure cleanup runs, etc.) will be added in another PR later.

@kendrick-ren, please, split the patch into patches that change one thing, e.g. "Exclude system tables and only restore user's schema".

Also, please provide a meaningful and more detailed description than the one in the current patch (don't confuse it with the PR description).

I'll also leave a few comments in the patch itself.

Collaborator

@vladzcloudius vladzcloudius left a comment

As promised - there are more comments.

tags:
- restore_token_ring
tasks:
- name: Stop all nodes serially
Collaborator

Nodes don't need to be stopped serially. Stopping nodes serially is only going to make this step take longer for no reason AFAICT.

Author

That is correct, this is a mistake. Let me move the step of stopping nodes out of this play.

Author

Fixed. Changed it to stop all nodes in parallel (lines 46-56).
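For readers unfamiliar with the mechanics, the difference boils down to the serial play keyword; a minimal sketch (group, task, and service names are illustrative, not taken from this playbook):

```yaml
# Serial: one node at a time -- only useful for rolling operations
- hosts: scylla
  serial: 1
  tasks:
    - name: Stop scylla-server
      service:
        name: scylla-server
        state: stopped

# Parallel: all nodes at once -- fine when the whole cluster goes down anyway
- hosts: scylla
  tasks:
    - name: Stop scylla-server
      service:
        name: scylla-server
        state: stopped
```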


tags:
- upload_snapshot
tasks:
- name: Restore
Collaborator

What's the point of moving this block to be before the restoration of the original cluster's seeds configuration?

The original structure of the playbook, which first set up the destination cluster (all the way) and only then restored the data, makes more sense than the proposed version, where the data restoration is stuck between two parts of the cluster configuration steps.

Author

The intention is to make the cleanup an always-run action, so that even if the restoration or any earlier step fails, the cluster can still come back.
The actual change that makes it an always-run action will come as a new feature in the next PR.
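The always-run behavior described here is typically expressed with Ansible's block/always; a minimal sketch with placeholder task bodies (not code from this PR):

```yaml
- name: Restore with guaranteed cleanup
  block:
    - name: Restore steps
      debug:
        msg: "refresh / upload tasks would go here"
  always:
    - name: Cleanup
      debug:
        msg: "runs even if a restore step above fails, so the cluster can come back"
```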

Author

Fixed. Left the structure as it was before.

Collaborator

You did not restore the original structure - the "block" and the corresponding indentation are still part of the first commit.
Please, remove.

# Cluster name to restore schema to. This is the name of the target cluster which
# is registered in ScyllaDB Manager
#
cluster_name: <target_cluster_name>
Collaborator

cluster_name is a mandatory value that must appear in scylla.yaml on every Scylla node.
Giving a variable with different semantics the same name is very confusing.
Please, rename it to something like scylla_manager_cluster_name.

Author

Agree, scylla_manager_cluster_name is better. Changed.


shell: |
scylla-manager-agent download-files -L {{ backup_location }} -n {{ host_id[inventory_hostname] }} -T {{ snapshot_tag }} -d /var/lib/scylla/data/ --dry-run | grep "^\s*\-" | cut -d"-" -f2 | cut -d"(" -f1
become_user: scylla
scylla-manager-agent download-files -L {{ backup_location }} -n {{ host_id[inventory_hostname] }} -T {{ snapshot_tag }} --dump-manifest 2>/dev/null | jq -r '.index[] | [.keyspace,.table] | join(".")'
Collaborator

This hunk adds a dependency on the third-party jq tool.
Note that the old version of this command didn't have this requirement.
If you still believe that using jq is necessary, please make sure to update README.md by adding the corresponding requirement to the "Prerequisites" section.

Author

@jpancier-scylla any specific reason for switching to jq? Can you elaborate?

Author

Based on the discussion today, we'll keep using jq and update the README accordingly.

Collaborator

Note that Ansible has native support for JSON.

Author

Updated README.md to include jq as a prerequisite.

Author

Note that Ansible has native support for JSON.

That is a good idea, let me convert the logic to use the built-in json parser instead of jq. Will submit another commit shortly.

Author

Switched to the Ansible built-in JSON parser and removed jq from both the code and README.md.
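One possible shape for that conversion, assuming the manifest output is registered into a variable such as _manifest_cmd (the fact and register names here are illustrative, not necessarily what the commit uses): parse the --dump-manifest output with from_json and rebuild the keyspace.table list that the jq filter .index[] | [.keyspace,.table] | join(".") produced:

```yaml
- name: Parse the manifest dumped by scylla-manager-agent
  set_fact:
    manifest: "{{ _manifest_cmd.stdout | from_json }}"

- name: Build the keyspace.table list, as the jq filter did
  set_fact:
    tables_in_snapshot: >-
      {{ manifest.index | map(attribute='keyspace')
         | zip(manifest.index | map(attribute='table'))
         | map('join', '.') | list }}
```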

Collaborator

Why not use the built-in Ansible JSON parser?

- name: Parse JSON from stdout
  ansible.builtin.set_fact:
    parsed_data: "{{ command_output.stdout | from_json }}"

# - name: Let's see our new facts
# debug:
# msg: "{{ inventory_hostname }} old seeds list is {{ old_seeds }}"

Collaborator

Put any cleanup into a separate patch/commit.

Collaborator

What about this comment?

Author

Excluded from this PR; will use a separate patch to get rid of it.

Collaborator

I still see it in the first commit of this PR.

gather_facts: false
serial: 1
tags:
- restore_token_ring
Collaborator

Unless you wipe the old data/commitlog/schema_commitlog/etc. directories like the original playbook did, Scylla nodes are not going to bootstrap and hence won't pick up new tokens.

This whole change makes very little sense.
Did you test that the playbook with your changes actually works?

Author

The new change was validated in both a sanity test and an impact test.

Author

Fixed, restored the original logic.

@kendrick-ren kendrick-ren changed the title restore users schema only and add related variables Exclude system tables and only restore user's schema Oct 21, 2025
@kendrick-ren
Author

@vladzcloudius I think all comments are resolved. Please review and add new comments if any. Thanks.

@vladzcloudius
Collaborator

@kendrick-ren, please, don't resolve other people's comments. This makes it hard to go over changes that have been requested.
Reply on these comments instead, e.g. "fixed".

# - name: Let's see our new facts
# debug:
# msg: "{{ inventory_hostname }} old seeds list is {{ old_seeds }}"

Collaborator

What about this comment?

@kendrick-ren
Author

@vladzcloudius
example-playbooks/restore_scylla_manager_backup/restore.yaml

- name: Let's see our new facts
  debug:
    msg: "{{ inventory_hostname }} old seeds list is {{ old_seeds }}"

Member
@vladzcloudius vladzcloudius 32 minutes ago
What about this comment?

This will be resolved in another patch instead of combining them into one, per your other earlier comment.

become_user: scylla
register: _tables_list
vars:
ansible_pipelining: true
Collaborator

Why is pipelining needed?
According to the Ansible documentation, pipelining conflicts with privilege escalation (become), which you use in this task.

Author

It is to prevent Ansible from writing a temp file under a temp directory for this step. Pipelining only conflicts with become when requiretty is enabled, which is usually disabled on systems managed with Ansible automation.
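Scoped per task via vars:, as this playbook does, the setting looks roughly like this (the task body is abbreviated, the command arguments are elided):

```yaml
- name: Get names of the tables in the snapshot
  shell: |
    scylla-manager-agent download-files ... --dump-manifest
  become_user: scylla
  vars:
    ansible_pipelining: true   # per-task: skip the remote temp-file round-trip
```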

Collaborator

Why do we care if Ansible is going to write a temp file?

- name: Save tables names list as a fact
  set_fact:
    system_schema_tables: "{{ tables_to_restore | select('search', '^\\s*system_schema\\.') | list }}"
    all_tables: "{{ _tables_list.stdout.split('\n') }}"
Collaborator

I don't get it: don't you already have it in _tables_list.stdout_lines ?

Author

Oops, I meant to use that, which is why I generated stdout_lines, but I forgot to update the step afterwards.

Fixed now.
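With stdout_lines, the split becomes unnecessary; the fixed step presumably reduces to something like:

```yaml
- name: Save tables names list as a fact
  set_fact:
    all_tables: "{{ _tables_list.stdout_lines }}"
```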


- name: Save names of tables to restore as a fact
  set_fact:
    tables_to_restore: "{{ all_tables | reject('search', '^system(_|\\.)') | list }}"
Collaborator

What about audit keyspace?

Author

@kendrick-ren kendrick-ren Oct 21, 2025

The audit keyspace is also kept here; this variable is only used in the node refresh step. In the later steps, the audit keyspace table content is restored as well.

port: 9042
host: "{{ listen_address }}"


Collaborator

Cleanups should be part of a separate commit.

Author

Okay, isolated the cleanup-related change into a separate commit.

Collaborator

I still see it in the first commit.

- restore_token_ring
tasks:
- name: Delete initial_token in scylla.yaml of {{ inventory_hostname }}
tags: cleanup
Collaborator

Adding tags (which allow you to run specific "tagged" tasks) should be put in a separate dedicated commit.

Author

@kendrick-ren kendrick-ren Oct 21, 2025

Okay, all cleanup-related changes went into a separate commit.

path: /etc/scylla/scylla.yaml
regexp: '^initial_token:'
line: ""
state: absent
Collaborator

This is a change in behavior; it should be sent as a separate dedicated commit with a description that gives the motivation for the change.

Author

@kendrick-ren kendrick-ren Oct 21, 2025

Since the purpose is to remove initial_token, state: absent deletes the line, while the original line: "" would leave a blank line (an extra blank line compared with the user's original scylla.yaml file).

It went to a separate commit.

Collaborator

@vladzcloudius vladzcloudius Oct 22, 2025

Please, don't mix independent changes in the same commit: there should be one commit for tagging and one for this hunk.
Each commit with a corresponding description.

tasks:
- name: Restore ScyllaDB schema from backup
tags:
- restore_schema
Collaborator

Why are there 2 different tags on a task of a play that has a single task?

Author

That is a good catch. Removed.

backrefs: yes

- name: Restore schema from a backup snapshot {{ snapshot_tag }}
hosts: all
Collaborator

I don't get it: why do you run this play on hosts: all while in fact you are going to run a single command on a Scylla Manager host?

Author

I think that is inherited, and hosts: all is the default in Ansible anyway.

But it would be overridden by the delegate_to and run_once settings in the subtask under it.

Collaborator

@vladzcloudius vladzcloudius Oct 22, 2025

These are new lines, not inherited.
And you can use whatever host or group of hosts you want with the hosts: ... parameter; hence the question.

Here you were supposed to use scyllamgr_host as your hosts:

Something like (after you address the corresponding comment below):

...
hosts: scylla-manager
...

And drop all this delegate_to and run_once nonsense.
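Put together, the suggestion looks roughly like this; the group name and the exact sctool arguments are assumptions for illustration, and must match what the inventory and the Scylla Manager setup actually define:

```yaml
# hosts.example (inventory): a dedicated section for the Scylla Manager host,
# for example:
#
#   [scylla-manager]
#   <scylla_manager_host_ip>
#
# The play then targets that group directly, with no delegate_to/run_once:

- name: Restore schema from a backup snapshot {{ snapshot_tag }}
  hosts: scylla-manager
  tasks:
    - name: Run the schema restore via Scylla Manager
      shell: |
        sctool restore --restore-schema ...
```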

when: item not in system_schema_tables
- name: Restore
tags:
- restore_data
Collaborator

Same question as before: why are there two tags for the same block of tasks?
And even if there is a reason for that, it must come in a separate commit.
In order to set a tag on a block you had to put all those commands into a block and change their indentation, which makes it hard to see the functional change in this commit, which was supposed to only change the way we restore the schema.

Please, take all this syntactic sugar out of the functional commit.

Author

It makes sense, removed the unnecessary tag.

# - name: Let's see our new facts
# debug:
# msg: "{{ inventory_hostname }} old seeds list is {{ old_seeds }}"

Collaborator

I still see it in the first commit of this PR.

@kendrick-ren
Author

#486 (comment)

@vladzcloudius Resolved.

Collaborator

@vladzcloudius vladzcloudius left a comment

The first patch description seems to be misleading.
There wasn't any issue with table UUIDs on the source cluster in the original procedure AFAIK.

If there was, please reference the corresponding GH issue.

AFAIU the change is required simply because the SM API has changed and there is a way to restore the schema without having to upload system KS data.

Please, reference the corresponding API change in the commit message too.

@kendrick-ren
Author

kendrick-ren commented Oct 22, 2025

The first patch description seems to be misleading. There wasn't any issue with table UUIDs on the source cluster in the original procedure AFAIK.

If there was, please reference the corresponding GH issue.

AFAIU the change is required simply because the SM API has changed and there is a way to restore the schema without having to upload system KS data.

Please, reference the corresponding API change in the commit message too.

Other than the API change, I was referring to this GH issue -> scylladb/scylla-manager#3019 (comment) for table UUIDs on the target cluster. Is that no longer an issue?




- name: Get names of the tables in the snapshot {{ snapshot_tag }}
shell: |
scylla-manager-agent download-files -L {{ backup_location }} -n {{ host_id[inventory_hostname] }} -T {{ snapshot_tag }} -d /var/lib/scylla/data/ --dry-run | grep "^\s*\-" | cut -d"-" -f2 | cut -d"(" -f1
scylla-manager-agent download-files -L {{ backup_location }} -n {{ host_id[inventory_hostname] }} -T {{ snapshot_tag }} --dump-manifest
Collaborator

This (first) commit does a lot more than just change the way the schema is restored.
It refactors all the places that use scylla-manager-agent, because its API has apparently changed.

The patch description should be adjusted accordingly.

Author

Added a description.

with_items: "{{ tables_to_restore }}"
when: item not in system_schema_tables
- name: Restore
block:
Collaborator

here...

Author

removed.


3.66.25.100: aff05f79-7c69-4ecf-a827-5ea790a0fdc6

# Host where Scylladb Manager is run from
scyllamgr_host: <scylla_manager_host_ip, e.g. localhost if the playbook is run on the Scylla Manager host>
Collaborator

This should be part of the inventory, not the vars.

Make it a new section in the inventory.

Author

fixed.


block:
  - name: Download data
    shell: |
      scylla-manager-agent download-files -L {{ backup_location }} -n {{ host_id[inventory_hostname] }} -T {{ snapshot_tag }} -d {{ data_dir }} -K '*,!system*' --mode upload
Collaborator

The regular expressions here and at line 51 are not the same.
Hence, if the user has a KS named system_my_keyspace, you will not download its data but will still try to restore it, and will report success while in fact nothing was restored.

Author

@kendrick-ren kendrick-ren Oct 22, 2025

Right, changed the pattern in this one.

On the other hand, the regex at line 51 would also exclude the example you gave; line 51 basically excludes everything starting with "system_" or "system.".

Hence, I changed the glob to '*,!system_*,!system.*'
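The mismatch can be seen with a hypothetical user keyspace such as systemfoo (illustrative name, not from a real cluster): the old glob !system* skips its download, while the regex ^system(_|\.) does not exclude it from the refresh list. A runnable sketch:

```shell
# Hypothetical user keyspace that begins with "system" but not "system_" or "system."
ks="systemfoo"

# The old download glob '*,!system*' excludes anything starting with "system"
case "$ks" in
  system*) glob="excluded" ;;
  *)       glob="kept" ;;
esac

# The refresh-list regex '^system(_|\.)' only excludes names starting
# with "system_" or "system."
if printf '%s\n' "$ks" | grep -Eq '^system(_|\.)'; then
  regex="excluded"
else
  regex="kept"
fi

# systemfoo would not be downloaded but would still be scheduled for refresh
echo "glob: $glob, regex: $regex"
```

With the updated glob '*,!system_*,!system.*' both filters agree again.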


- name: refresh nodes with the restored data
  shell: |
    nodetool refresh {{ item.split('.') | join(' ') }}
Collaborator

This is not going to work even with the nodetool from 2025.1, since the syntax is now nodetool refresh --keyspace <KS> --table <Table name>:

$ nodetool help refresh
Load newly placed SSTables to the system without a restart


Add the files to the upload directory, by default it is located under
/var/lib/scylla/data/keyspace_name/table_name-UUID/upload.
Materialized Views (MV) and Secondary Indexes (SI) of the upload table, and if
they exist, they are automatically updated. Uploading MV or SI SSTables is not
required and will fail.

For more information, see: https://opensource.docs.scylladb.com/branch-2025.1/operating-scylla/nodetool-commands/refresh.html"


scylla-nodetool options:
  -h [ --help ]                  show help message
  --help-seastar                 show help message about seastar options
  --help-loggers                 print a list of logger names and exit
  -h [ --host ] arg (=localhost) the hostname or ip address of the ScyllaDB
                                 node
  -p [ --port ] arg (=10000)     the port of the REST API of the ScyllaDB node
  --rest-api-port arg            the port of the REST API of the ScyllaDB node;
                                 takes precedence over --port|-p
  --password arg                 Remote jmx agent password (unused)
  --password-file arg            Path to the JMX password file (unused)
  -u [ --username ] arg          Remote jmx agent username (unused)
  --print-port                   Operate in 4.0 mode with hosts disambiguated
                                 by port number (unused)
  --keyspace arg                 The keyspace to load sstable(s) into
  --table arg                    The table to load sstable(s) into

refresh:
  --load-and-stream              Allows loading sstables that do not belong to
                                 this node, in which case they are
                                 automatically streamed to the owning nodes
  --primary-replica-only         Load the sstables and stream to primary
                                 replica node that owns the data. Repair is
                                 needed after the load and stream process

Collaborator

I recommend considering the REST API instead; it is probably less likely to break than nodetool.
Check whether the corresponding REST API in 2022.2 is different from the one in 2025.1.

Author

Hmm, the following docs: https://docs.scylladb.com/manual/branch-2025.1/operating-scylla/nodetool-commands/refresh.html
and https://enterprise.docs.scylladb.com/branch-2024.1/operating-scylla/nodetool-commands/refresh.html

both show that the same syntax is still supported, except that 2025.1 and beyond add more options. The tests done earlier on 2024.1 LTS and 2025.1 LTS/2025.2/2025.3 all worked.

In addition, the REST API is still marked as BETA in 2024.2, and it looks like it is not exposed in the 2024.1 LTS official docs at all.

We have customers still on 2024.1 LTS, so it is probably better to keep using nodetool for now.
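For reference, both spellings discussed in this thread, built from one keyspace.table item the way the playbook's Jinja expression item.split('.') | join(' ') does (the keyspace and table names are illustrative):

```shell
# One item from tables_to_restore, in keyspace.table form (illustrative names)
item="my_ks.my_table"

# Shell equivalent of the playbook's {{ item.split('.') | join(' ') }}
ks="${item%%.*}"
tbl="${item#*.}"

# Positional form used by the playbook (the form the author tested):
echo "nodetool refresh $ks $tbl"

# Flag form shown in the 2025.1 help output:
echo "nodetool refresh --keyspace $ks --table $tbl"
```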

nodetool refresh {{ item.split('.') | join(' ') }}
with_items: "{{ tables_to_restore }}"

- name: Restart Scylla service to pick up the restored data
Collaborator

Why do you need to restart Scylla to pick up restored data? The whole point of nodetool refresh is to NOT require a restart.

Author

That is a good one. Took it out.

@kendrick-ren
Author

For this comment (#486 (comment)): the scylla user may or may not have the privilege to write the temp file under a temp directory for an Ansible task. When it doesn't, the task emits a warning about the missing privilege, which may confuse customers even though it is harmless. The ansible_pipelining option avoids this.

@kendrick-ren
Author

For #486 (comment), the cleanup-related changes are now taken out of the PR.

@kendrick-ren
Author

for #486 (comment), fixed.

…uuids on the target cluster

Use the Scylla Manager command "restore --restore-schema" (see https://manager.docs.scylladb.com/stable/sctool/restore.html
for the details of the command) to restore the user's schema instead of the previous
approach of restoring system table data. This avoids the issue discussed in scylladb/scylla-manager#3019.

Refactor the task "Get names of the tables in the snapshot" with the new --dump-manifest option, and
replace the previous JSON parsing logic with the Ansible built-in JSON parser for more reliable results.

Refactor the inventory file into 2 separate sections: one for Scylla hosts and one for the Scylla Manager host, which is required to run
the above Scylla Manager "restore --restore-schema" command.